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Abstract 

The  design  of  a  module  system  for  constructing  and  maintaining  large  programs  is  a  difficult  task  that 
raises  a  number  of  theoretical  and  practical  issues.  A  fundamental  issue  is  the  management  of  the  flow  of 
information  between  program  units  at  compile  time  via  the  notion  of  an  interface.  E.xperience  has  shown  that 
fully  opaque  interfaces  are  awkward  to  use  in  practice  since  too  much  information  is  hidden,  and  that  fully 
transparent  interfaces  lead  to  excessive  interdependencies,  creating  problems  for  maintenance  and  separate 
compilation.  The  “sharing”  specifications  of  Standard  ML  address  this  issue  by  allowing  the  programmer  to 
specify  equational  relationships  between  types  in  separate  modules,  but  are  not  expressive  enough  to  allow 
the  programmer  complete  control  over  the  propagation  of  type  information  between  modules. 

These  problems  are  addressed  from  a  type-theoretic  viewpoint  by  considering  a  calculus  based  on  Llirard's 
system  F,^.  The  calculus  differs  from  those  considered  in  previous  studies  by  relying  e.xclusively  on  a  new 
form  of  weak  sum  type  to  propagate  information  at  compile-lime,  in  contrast  to  approaches  based  on  strong 
sums  which  rely  on  substitution.  The  new  form  of  sum  type  allows  for  the  specification  of  equational,  as  well 
as  type  and  kind,  information  in  interfaces.  This  provides  complete  control  over  the  propagation  of  compile¬ 
time  information  between  program  units  and  is  sufficient  to  encode  in  a  straightforward  way  most  uses  of 
type  sharing  specifications  in  Standard  ML.  Modules  are  treated  as  ''first-class''  citizens,  and  therefore  the 
system  supports  higher-order  modules  and  some  object-oriented  programming  idioms;  the  language  may  be 
eeisily  restricted  to  “second-class”  modules  found  in  ML-like  languages. 


DTIC  QUALITY  IHSPTiCTED  3 


1  Introduction 


Modularity  is  an  essential  technique  for  developing  and  maintaining  large  software  systems  [46,  24,  36].  Most 
modern  programming  languages  provide  some  form  of  module  system  that  supports  the  construction  of  large 
systems  from  a  collection  of  separately-defined  program  units  [7,  8,  26,  32].  A  fundamental  problem  is  the 
management  of  the  tension  between  the  need  to  treat  the  components  of  a  large  system  in  relative  isolation 
(for  both  conceptual  and  pragmatic  reasons)  and  the  need  to  combine  these  components  into  a  coherent 
whole.  In  typical  cases  this  problem  is  addressed  by  equipping  each  module  with  a  well-defined  interface 
that  mediates  all  access  to  the  module  and  requiring  that  interfaces  be  enforced  at  system  link  time. 

The  Standard  ML  (SML)  module  system  [i7,  32]  is  a  particularly  interesting  design  that  has  proved  to  be 
useful  in  the  development  of  large  software  systems  [2,  1,3,  11,  13].  The  main  constituents  of  the  SML  module 
system  are  signatures,  structures,  and  functors,  with  the  latter  two  sometimes  called  modules.  A  structure  is 
a  program  unit  defining  a  collection  of  types,  exceptions,  values,  and  structures  (known  as  sub-.strnrfure.s  of 
the  structure).  A  functor  may  be  thought  of  as  a  "parameterized  structure",  a  first-order  function  mapping 
structures  to  structures.  A  signature  is  an  interface  describing  the  constituents  of  a  structure  —  the  ty])es. 
values,  exceptions,  and  structures  that  it  defines,  along  with  their  kinds,  types,  and  interfaces.  See  Figure  1 
for  an  illustrative  example  of  the  use  of  the  SML  module  system:  a  number  of  sources  are  available  for 
further  examples  and  information  [15,  39]. 

A  crucial  feature  of  the  SML  module  system  is  the  notion  of  type  sharing^  which  allows  for  the  specification 
of  coherence  conditions  among  a  collection  of  structures  that  ensure  that  types  defined  in  separate  motlules 
coincide.  The  classic  example  (adapted  from  MacQueen)  is  the  construction  of  a  parser  from  a  lexer  and 
a  symbol  table,  each  of  which  make  use  of  a  common  notion  of  symbol  (see  Figure  2).  The  parser  is 
constructed  by  a  functor  that  takes  as  arguments  two  modules,  a  lexer  and  a  symbol  table  manager.  I'lie 
parser  composes  functions  from  the  lexer  and  symbol  table  manager:  the  composition  is  well-typed  only 
if  the  two  modules  "share”  a  common  notion  of  symbol.  Within  the  body  of  the  functor  Parser  the 
types  L.S. symbol  and  T.S. symbol  coincide,  as  specified  by  the  type  sharing  specification  in  the  parameter 
signature.  See  MacQueen ’s  seminal  paper  for  further  examples  and  discussion  [26.  17]. 

1,1  Transparency  and  Opacity 

Module  bindings  in  SML  are  "transparent”  in  the  sense  that  the  type  components  of  a  module  an'  fully 
visible  in  the  scope  of  the  binding.  For  example,  the  structure  declaration 

structure  S  =  struct 
type  t  =  int 
type  u  =  t  ->  t 
val  i  =  In  x:t  =>  x 
end 

introduces  a  structure  variable  S  with  type  components  S.t  and  S.u  and  value  component  S.f.  Within 
the  scope  of  S,  the  type  S.t  is  equivalent  to  the  type  int  and  the  type  S.u  is  etpiivahMit  to  the  lyp<'  int- 
>int.  These  equivalences  are  not  affected  by  the  ascription  of  a  signature  to  the  binding.  For  example,  the 
signature  SIG  defined  by  the  declaration 

signature  SIG  =  sig 
type  t 
type  u 
val  f  :  u 
end 

may  i)e  correctly  ascribed  to  the  structure  S  without  obscuring  the  bindings  of  S.t  and  S.u. 

Functor  bindings  are  similarly  "transparent”  in  that  the  type  compoiu'iits  of  th(>  result  of  any  application 
of  the  functor  are  fully  visible  within  the  scope  of  the  functor  binding.  For  example,  considi'r  the  following 
functor  declaration: 


'The  closely-related  notion  of  .^trurturr  .^haring  is  not  eonsidererl  in  this  paper. 
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signature  SYMBOL  »  sig 
type  symbol 

val  intern  :  string  ->  synbol 
val  pname  :  symbol  ->  string 
val  eq  :  symbol  •  symbol  ->  bool 
end 

structure  Symbol  :  SYMBOL  »  struct 
structure  HashTable  :  HASH-TABLE  >  . . . 
type  symbol  °  HashTable. hashJcey 
fun  intern  id  « 

HashTable . enter  (HashTable. hash  id)  id 
fun  pname  sym  >  HashTable. retrieve  syn 
fun  eq  (si,  a2)  >  HashTable .sameJcey  si  s2 
end 


Figure  1;  Example  of  the  SML  Module  System 


signature  LEXER  =  sig 
structure  S  :  SYMBOL 
type  token 


end 

signature  SYMTAB  =  sig 
structure  S  ;  SYMBOL 

end 

signature  PARSER  =  sig 
structure  L  :  LEXER 

val  parse  :  string  ->  Lexer. token  stream 


end 

functor  Parser 

(structure  L:LEXER  euid  T'.SYMTAB 
sharing  type  L. S.synbol=T.S. synbol) :PARSER= 
struct 


end 


Figure  2:  Sharing  Specifications 
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functor  F(structure  X:SIG):SIG  =  struct 
typo  t  =  X.t  ♦  X.t 
type  u  =  X.u 
val  f  =  X.i 
end 

The  bindings  of  the  t  and  u  components  of  any  application  of  F  are  fully  visible  as  a  function  of  the  t  and 
u  components  of  the  parameter  X.  For  example,  within  the  scope  of  the  declaration 

structure  T:SIG  =  F(S) 

the  type  T.t  is  equivalent  to  the  type  int*int  and  the  type  T.u  is  equivalent  to  the  type  int->int. 

It  is  possible  only  to  a  very  limited  extent  in  SML  to  specify  in  a  signature  the  bindings  of  the  types  in 
a  module.  For  example,  we  may  augment  the  signature  SIG  with  a  sharing  specification  to  specify  that  t 
is  int  as  follows: 

signature  SIG'  =  sig 
type  t 

sharing  type  t=int 
type  u 
val  f  ;  u 
end 

This  method  cannot  be  extended  to  specify  the  binding  of  u  —  sharing  specifications  may  only  involve 
type  names,  not  general  type  expressions.  To  fully  determine  the  type  bindings  in  S  requires  a  tmnspareiil 
signature: 

signature  FULL^IGJS  =  sig 
type  t  =  int 
type  u  =  int  ->  int 
val  f  :  int  ->  int 
end 

Note  that  this  is  not  a  legal  SML  signature  because  of  the  type  equations.  There  is  no  way  to  express 
equations  such  as  u  =  int  ->  int  in  SML  signatures.  The  signature  FULLJSIG^  is  the  "fuH'’  signature  of 
S  since  it  completely  determines  the  bindings  of  its  type  components. 

The  importance  of  transparent  signatures  only  becomes  apparent  when  we  consider  functors  and  the 
closely-related  abstraction  bindings  suggested  by  MacQueen  [26.  17].  Functor  parameters  are  opaque 
in  the  sense  that  the  ascribed  signature  is  the  sole  source  of  type  information  for  those  parameters  (this 
property  is  the  beisis  for  the  reduction  of  abstraction  bindings  to  functor  applications).  Fine  control  over 
the  "degree"  of  opacity  of  a  functor  parameter  can  be  achieved  by  admitting  transparent,  or.  more  generally. 
translucent,  signatures  that  allow  for  the  partial  (possibly  full)  determination  of  the  type  components  of  a 
module.  For  example,  the  translucent  signature 

signattire  PARTIAL_SIG_S_1  =  sig 
type  t 

type  u  =  t  ->  t 
val  i  :  u 
end 

is  a  partially  transparent  signature  that  leaves  the  type  t  unconstrained,  but  d('t('rmin<'s  u  up  to  the  choice 
of  t.  The  structure  S  matches  the  signature  PARTIAL_SIG-S-1  since  S.u  is  equal  to  S.t->S.t  which  is  its<>lf 
equal  to  int->int.  Conversely,  the  translucent  signature 

signature  PARTIAL-SIGJ5-2  =  sig 
type  t  =  int 
tjrpe  u 
val  f  :  u 
end 


determines  t,  leaving  u  unconstrained.  This  signature  is  essentially  equivalent  to  the  signature  SIG’  above." 

S  matches  this  signature. 

Translucent  signatures  are  particularly  useful  in  connection  with  higher-order  functors  [43].  Using  equa¬ 
tions  in  signatures  it  is  possible  to  specify  the  dependency  of  the  functor  result  on  the  functor  argument. 
For  example,  a  natural  signature  for  the  functor  F  defined  above  is  the  following  functor  signature  which 
fully  determines  the  type  components  of  F; 

lunsig  FULL-SIG_F  = 

(structure  X.SIG): 

sig  type  t=X.t*X.t  type  u=X.u  val  f:u  end 

Another,  less  precise,  signature  for  F  is  the  functor  signature 

lunsig  PARTI AL-SIG_F  = 

(structure  X:SIG): 
sig  type  t  type  u=X.u  val  l;u  end 

which  only  specifies  the  behavior  of  F  on  the  component  u,  leaving  its  behavior  on  t  unspecified. 

Higher-order  functors  [43]  are  particularly  important  in  connection  with  separate  compilation.  A  separately- 
compiled  module  may  be  represented  by  a  variable  whose  signature  is  the  "fuH"  transparent  signature  of 
the  module  itself  [4*2].  By  abstracting  the  client  module  with  respect  to  these  variables  we  obtain  a  (pos¬ 
sibly  higher-order)  functor  whose  application  models  the  process  of  linking  the  clients  of  the  module  with 
its  actual  implementation.  The  signature  matching  process  ensures  that  the  presumed  signature  of  the 
separately-compiled  module  is  consistent  with  the  module  itself  so  as  to  guarantee  type  safety.  The  full 
signature  of  the  separately  compiled  module  is  necessary  to  ensure  that  separate  and  combined  compilation 
yield  the  same  result. 

1.2  Static  Semantics 

The  static  semantics  of  SML  [3*2]  is  defined  by  a  collection  of  complex  elaboration  rules  that  specify  the 
static  well-formedness  conditions  for  .SML  programs.  The  main  techniques  employed  in  the  .semantics  are 
the  use  of  “unique  names"  (or  ’‘generativity" )  to  handle  abstraction  and  sharing  specifications,  and  the  use 
of  non-deterministic  rules  to  handle  polymorphism,  sharing  specifications,  and  signature  matching.  The 
static  semantics  has  proved  useful  as  a  guide  to  implementation  [*28,  *2.  41.  10],  but  is  remarkably  difficult 
to  modify  or  extend  (see,  for  example,  [43]).  The  naive  attempt  to  enrich  signatures  as  sketched  aliove  is 
incompatible  with  the  crucial  “principal  signature"  property  [31].  But  it  is  not  clear  whether  this  failure  is 
a  symptom  of  an  intrinsic  incoherence  in  the  language,  or  is  merely  an  artifact  of  the  semantic  method.'* 

In  an  effort  to  gain  some  insight  into  the  complexities  of  the  static  semantics  .several  authors  have 
undertaken  a  type-theoretic  analysis  of  SML.  especially  its  module  system  [*27.  33,  3r).  7,  *20,  19],  Previous 
studies  of  the  module  system  focused  on  the  transparency  of  SML-style  structures  through  the  use  of  "strong 
sum"  or  “dependent  product”  types.  These  are  types  of  the  form  whose  elements  are  pairs 

(Mi.iWo)  that  are  accessed  via  projections  Xi(A/)  and  irn(M).  Tlx-  crucial  properties  of  strong  sums  [2!)] 
are  that  if  M  :  Hx.A.Blx),  then  Trn(M)  :  S(xi(.V/)),  and  that  .  M->))  =  .\/|.  Together,  these 

properties  ensure  that  type  information  is  propagated  in  rough  accord  with  the  SML  static  semantics. 
(See  [27,  33,  20,  19]  for  further  discussion.)  Substittition-ltased  methods  are  problematic  in  the  presence 
of  computational  effects,  unless  care  is  taken  to  account  for  the  phase  distinction  [20],  Moreover,  strong 
sums  fail  to  account  for  sharing  specifications  and  the  abstract  nature  ( "generativity” )  of  structure  and 
datatype  bindings. 

In  this  paper  we  extend  the  type-theoretic  analysis  of  SML-like  module  systems  by  presenting  a  calculus 
with  the  following  features: 


^  It  seems  plausible  that  most  uses  of  type  sharing  specifiewtions  may  be  .u  eoiinterl  for  in  this  way,  provided  tliiit  we  neglei  l 
local  specifications  and  re-binding  of  variables  in  specifications,  both  of  which  .ire  of  <|iiestionable  utility. 

^  Based  on  the  approach  taken  here,  and  a  related  idea  <lue  to  l.,en»y,  lofte  hiis  recently  tievised  a  way  to  accuinnio<lat<’  a 
form  of  type  abbreviation  in  Standard  MLs  ignatures  [d'l). 
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Kinds 

K 

::= 

n  1  a'=4'A" 

Constructors 

A 

::= 

o  1  nx:A.A'  1  {D, . D„}  1 

Ao::A'.A  1  A  A'  1  V.h 

Declarations 

D 

::= 

h>  a::K  |  b  t>  a;:A'=A  |  y  o  i:A 

Terms 

M 

::  = 

I  1  Xx-.A.M  1  MM'  1  M:A  | 
{5i,...,fl„}  1  M.y 

Bindings 

B 

::= 

bt>  a!=A  1  y  &  x=M 

Values 

V 

X  1  Ax;A.M  1  {By, . flv„}  | 

V.y 

By 

::= 

bc>  o=A  1  y  p  x=V 

Contexts 

r 

•  1  r,o::A-  1  r,a::A=A  [  r,x:A 

Figure  3:  Syntax  rules  (n  >  0) 


•  TVanslucent  sum  types,  which  generalize  weak  sums  by  providing  labeled  fields  and  equations  governing 
type  constructors.  These  mechanisms  obviate  the  need  for  substitution,  and  account  for  abstraction 
and  common  uses  of  SML-style  sharing  specifications. 

•  A  notion  of  subsumption  that  encompasses  a  “coercive"  pre-order  associated  with  record  fields  and  a 
“forgetful”  pre-order  associated  with  equations  that  represent  sharing  information, 

•  Treatment  of  modules  as  “first-class”  values.  The  typing  rules  ensure  that  visibility  of  compile-time 
components  is  suitably  restricted  when  run-time  selection  is  used  (see  also  [34]).  If  run-time  .selection 
is  not  used,  modules  behave  exactly  as  they  would  in  a  more  familiar  “second-class"  module  system 
such  as  is  found  in  SML. 

Our  calculus  improves  on  previous  work  by  providing  a  much  greater  degree  of  control  over  the  propagation 
of  type  information  at  compile  time  so  that  we  can  achieve  the  effect  of  SML-like  sharing  specifications  and 
provide  direct  support  for  abstraction. 

2  Overview  of  the  Calculus 

Our  system  is  based  on  Girard’s  [14]  in  much  the  same  way  that  many  systems  are  based  on  the  .second- 
order  lambda  calculus  (Ft).  That  is  to  say,  our  system  can  be  (roughly)  thought  of  as  being  obtained 
from  Fu  by  adding  more  powerful  constructs  (translucent  sums  anil  ilependent  functions)  and  a  notion  ot 
subtyping  and  then  removing  the  old  constructs  (quantification'  (V),  weak  sums  (3),  and  non-dependent 
functions  ( — ))  superseded  by  the  new  ones.  Subtyping  interacts  with  the  rest  of  the'  calculus  via  implicit 
subsumption.  Bounded  quantification  is  not  supported. 

Like  F^,  our  system  is  divided  into  three  levels;  terms,  (type)  constructors,  and  kinds.  Kimls  classify 
constructors,  and  a  subset  of  constructors,  called  types,  having  kind  ft  classify  terms.  The  kind  level 
is  necessary  because  the  constructor  level  contains  functions  on  constructors.  Example:  the  constructor 
Aa;;ft.a — a  has  kind  ft^^ft  and  when  applied  to  type  Int  yields  Int — Int. 

The  syntax  rules  for  our  system  are  given  in  Figure  3.  The  meta  variable  i»  ranges  over  constructor 
variables  and  the  meta  variable  x  ranges  over  term,  vartables.  The  meta  variable  b  ranges  over  constructor 
field  names  and  the  meta  variable  y  ranges  over  term  field  names.  We  have  placed  field  names  in  bold  In 
order  to  help  emphasize  that  they  are  names,  not  variables.  The  complete  typing  rules  appear  in  appendix 


* QuantiRcation  is  Jerivable  from  dependent  functions  and  treinsliicenl  sums.  The  basic  i<lcii  is  to  ti.aiisforni  M 

into  Ax:{b& ot::A'}.  [a:.b/o]Af  where  x  is  not  free  in  M  and  M  [.-t]  into  Xt  {bi>or=.-\}.  Note  llial  this  iiupleinents  lonstructor 
abstraction  tts  a  delaying  operation  unlike  the  normal  SML  semantics.  Site  [Hi]  for  a  dismission  of  the  differences  between  these 
two  interpretations  of  constructor  abstractions  and  why  this  choice  seems  to  be  preferable. 


Our  handling  of  dependent  functions  (Xx.A.M)  is  standard  [29]  except  that  our  elimination  rule  only 
allows  for  the  application  of  functions  having  arrow-types®  (non-dependent  function  types).  Tlie  iiornial 
elimination  rule  for  dependent  functions  does  not  have  this  restriction,  requiring  functions  only  to  have  a  fl- 
type.  We  make  this  restriction  because  we  intend  to  extend  this  system  in  future  work  with  effect-producmg 
primitives.  In  the  presence  of  effects,  the  unrestricted  rule  is  unsound  because  of  interactions  with  tne 
first-class  nature  of  translucent  sums.” 

Trcuislucent  sums  ({Sj , . . . ,  Bn}),  the  central  contribution  of  the  calculus,  will  be  discussed  at  length  in 
the  next  section.  Very  briefly,  they  are  n-ary  labeled  dependent  sums  whose  types  can  optionally  contain 
formation  about  the  contents  of  their  constructor  fields.  Traditional  records  and  weak  sums  (existentials) 
are  degenerate  forms  of  translucent  sums. 

A  mechanism  (written  M:A)  for  forcing  a  term  M  to  be  coerced  to  a  .supertype  .1  is  [irovided.  The 
subtyping  relation  allows  for  both  generalized  record  subtyping  [7,  (5,  9]  (fields  which  are  not  depeiuled  i>n 
by  the  other  fields  in  a  translucent  sum  may  be  dropped)  and  for  the  forgetting  of  information  about  tlie 
contents  of  constructor  fields. 

There  are  two  basic  forms  of  constructor  definitions.  A  constructor  definition  is  upaqut  if  within  tin' 
scope  of  the  definition  there  is  no  information  available  about  the  identity  of  the  oonslriictor  variable  being 
defined.  By  contrast,  a  transparent  constructor  definition  makes  available  the  identity  of  the  constructor 
variable  that  is  being  defined. 

Contexts  in  our  system  can  contain  both  opaque  (o::/\)  and  transparent  (a::/v=.l)  definitions.  The 
effect  of  transparency  is  implemented  by  a  typing  rule  (ABBREV’)  for  the  constructor  equality  relation 
judgement  that  establishes  that  a  =  .d  when  a::K=A  is  in  the  context.  (This  is  similar  to  mechanisms  used 
in  AUTOMATH  and  LEGO  [12,  45,  25].) 

Our  calculus  is  intended  to  be  interpreted  using  a  call-by-value  s<'mantics.  The  typing  system  is  not 
sound  for  call-by-name  in  the  presence  of  effects.'  We  restrict  terms  in  constructors  to  values  in  order  to 
avoid  the  problem  of  trying  to  give  a  meaning  to  a  constructor  containing  a  side-elfecting  lerm."^  In  our 
system,  values  (V')  are  considered  to  be  term  variables,  term  A-expre.ssions.  translucc'ut  sums  containing  only 
values  and  constructors,  and  selections  of  term  fields  from  values.  We  allow  I’.y  to  be  ,i  value  in  order  to 
allow  paths  like  x.y.y'  to  be  values  while  still  keeping  the  set  of  values  clo.sed  under  the  substitution  of  a 
value  for  a  term  variable. 

3  Translucent  Sums 

A  translucent  sum  has  the  form  of  a  possibly  empty  sequence  of  hiiidtiigs  written  betwet'ii  curly  brar(‘s 
({Bi, . . . ,  Bn}).  The  corresponding  translucent  sum  type  is  similar  except  that  (^((■larnttt)ns  are  ust-d  instead 
of  bindings  ( {  Di , . . . ,  Dn } ) . 

Translucent  sums  differ  from  traditional  records  in  a  number  of  ways.  In  .addition  to  normal  term  liehls. 
they  can  contain  constructor  fields.  The  type  or  content  of  later  Helds  in  a  translucent  sum  can  dt'pend  on 
the  content  of  earlier  fields.  As  an  example,  consider  the  following  translucent  sum,  call  it  P,  t  hat  pack.ages 
up  a  type  with  a  value  of  that  typer' 

{b>0'=lnl,  yc>r=.{(  (bcxvrSl, 

Binding  r  to  P  would  give  r.b  ::  Q  and  r.y  :  rb,"’ 


'"The  arrow-typ^  -4  — .4^  ran  be  regarfird  as  an  abbreviation  for  llx:/!.,!*  where  r  is  nut  free  in  .4'. 

®  Abbreviated  example:  (((Ax.  jr.y)A/).J)  ((( Ax.  x.y )A/).l )  where  M  i.s  the  example  in  seriiun  l.(>. 

'  To  see  this  consider  the  outermost  d-reflex  of  (Ax.(x.y.2)  (j  .y  l ))  A/  where  .\!  is  the  example  in  sertit>ii  !.(>. 

^It  is  not  clear  that  allowing  general  terms  in  miistnictors  wouhl  be  that  \is«*ful  anyway  siin  e  the  >ubst itiit l»»n  of  general 
terms  for  term  variables  would  be  prohibited  in  a  <;aJI-by-vahie  setting  in  any  r.-use. 

^We  suggest  pronouncing  as  “as".  ■*:"  as  “has  type",  and  as  “has  kind". 

^®Note  the  distinction  between  r  and  P  here:  r  is  a  term  variable  (and  hence  a  value)  whih*  is  a  lerni  me(,i-varial>le 
denoting  a  non- value.  This  flistinrtif>n  is  important  l>ecause  the  typing  niles  treat  values  sperially. 


The  scope  of  variables  bound  in  bindings  and  declarations  is  all  the  following  bindings/declaraiions  in 
that  translucent  sum  (type).  For  example,  the  scope  of  x  in  the  following  translucent  sum  includes  .\/'  and 
A'  but  not  M  or  .4: 

{bi>a=d,  yt>x=A'/,  y'  >  x'=M' .  b'c>a'=,-l'} 

Scoping  for  the  other  constructs  is  as  normal.  VVe  regard  terms,  constructors,  etc.,  that  differ  only  by 
a-conversion  of  variables  as  equivalent. 

Note  that  field  names  cannot  be  o-converted.  Changing  a  field  name  in  a  translucent  sum  term/type 
results  in  a  different  term/type  because  the  set  of  legal  field  names  which  can  be  selected  changes.  Failure 
to  distinguish  between  field  names  which  cannot  be  <>-converted  and  the  internal  names  for  fields  which 
must  be  able  to  a-convert  in  order  to  permit  substitution  to  work,  leads  to  problems."  For  example,  the 
equivalent  of  the  following  cannot  be  written  straightforwardly  in  SML:‘- 

{bc>c»=lnt,  y  c>  r  =  {b  c>  (>'=:Booi,  y'oj''  =  Ar:o'.  1)} 

{bc>ar::Q,  y  t>  r:{b  o  y' t>  x’:q' — a}} 

Because  SML  does  not  distinguish  the  two  kinds  of  names,  it  is  problematic  to  express  that  the  type  of  the 
y'  field  depends  on  both  the  outer  and  inner  b  fields. 

The  field  names  of  any  given  translucent  sum  (type)  are  required  to  be  distinct.  Translucent  sum 
types  that  differ  only  in  that  their  declarations  have  been  reordered  without  violating  any  dependences  are 
considered  equivalent.  For  example,  the  following  first  two  types  are  equivalent  but  both  are  different  from 
the  third  type: 

{bc>o::fl,  bi  o  oi b2>02::Q=cy} 

{bc>a::n,  bi  o  bi  &  Oi  ::f2=ar } 

{bi  t>  ::ff=a,  bc>o;:f7,  b2  >  f»2:d^='> } 

It  may  help  to  think  of  translucent  sum  types  as  being  directed  acyclic  graphs  (DACs)  where  the  nodes  are 
declarations  and  the  edges  are  dependencies  by  one  declaration  on  a  variable  ileclared  in  .mother  declaration. 

It  is  possible  to  include  information  in  a  translucent  sum  type  on  the  contents  of  ’he  constructor  fields 
of  its  instances.  This  ability  can  be  used  to  give  a  more  expressive  type  to  P: 

{bc>o=lnt,  yt>j:=.J}  {bt>  0';:$2=lnt.  yorto} 

If  it  can  be  shown  that  r  h^ls  this  type,  then  it  can  be  inferred  that  r.b  =  Int.  This  can  not  be  iid'erred  if  it 
can  only  be  shown  that  r  has  the  less  expressive  type.  The  use  of  nested  translucent  sums  and  constructor 
field  component  information  can  give  rise  to  more  complex  dependencies  as  the  following  example  illustrates: 

{y  >r={b>o'=lnt},  y' t>  bc>f»=x.b} 

|y  o  x:{b  o  0'::n},  y'c>x':r.b.  b  t>  <>::n=x.b} 

3.1  Introduction  and  elimination  rules 

The  introduction  rule  for  translucent  sums  is  as  follows: 

r  valid 

Vi€  [l..»]-  . B,  :  D,  ITSI'M) 

rt-  {fl, . B„]  :  (D, . £>„} 

(The  overline  function  (D)  merely  strips  off  the  field  name.)  Note  that  each  of  the  bindings  is  being  type 
checked  under  a  context  which  takes  account  of  the  effect  of  all  of  the  previous  bindings.  Constructor 
bindings  result  in  transparent  definitions,  both  when  type  checking  later  bindings  an<l  in  tin'  n'snlting  type. 


"To  avoid  verbosity,  a  real  prognamming  language  based  on  .nir  .system  would  prob.ably  provide  I  hat  by  delaiilt  only  one 
neime  need  be  given  per  field,  to  be  used  as  both  the  field  name  and  the  internal  name. 

It  is  possible  to  write  this  in  .SML  by  using  a  combination  of  local  specificatitms  and  type  sharing  In  I  he  signature,  thanks 
to  .Mads  Tofte  for  pointing  this  out.  Unfortunately,  however,  some  SML  implementations  SML/N.I)  elo  not  implemenl 

local  specifications  in  signatures  properly  so  this  is  not  very  helpful  in  practice. 


t 


Thus,  the  introduction  rule  gives  P  the  more  expressive  type.  The  less-expressive  type  is  obtained  by  tii^ 
use  of  the  subsumption  rule  afterwards. 

There  are  two  elimination  forms  with  corresponding  rules  for  translucent  sums,  one  for  constructor  fields 
and  one  for  term  fields: 


r  t-  V  :  {b^a::A■} 
r  t-  V  .b  :  :  A' 


(C-EXT-O) 


r  I-  A/  :  {y  >  r:d} 
r  h  .V/.y  :  .4 


(EXT-V) 


In  order  to  appl>  these  rules  to  translucent  .sums  with  multiple  fields,  it  is  first  necessary  to  use  the 
subsumption  rule  to  drop  the  fields  that  are  not  being  selected.  The  constructor  field  case  may  also  require 
that  type  information  about  the  field  to  be  selected  be  dropped  I'nlike  traditional  records,  with  translucent 
sums  it  is  not  always  possible  to  drop  all  the  other  fields  because  the  field  we  wish  to  select  may  depend  in 
an  essential  way  on  them.  Thus,  the  fact  that  M  has  a  translucent  sum  type  with  a  y  field  is  not  in  itself 
sufficient  to  ensure  that  A/.y  is  well-typed. It  is  always  possible  to  drop  fields  from  the  type  of  V  because 
of  the  VALUE  rules  which  ve  will  discuss  in  section  3.3. 


3.2  Translucency 

When  appears  in  the  context  where  .-I  is  a  translucent  sum  type  which  contains  information  about  tiie 
contents  of  the  constructor  fields  of  its  instances,  it  gives  rise  to  equations  via  the  following  rule: 


r  H  V  :  {boo::A'=A} 
r  E  V.b  =  A  ::  A' 


'ABBREV) 


Thus  it  is  possible  to  infer  that  r.b  =  Int  when  it  can  be  shown  that  r  has  the  type  ( {bo  o::f}=lnt.  y  o  J'.tt} ) 
but  not  when  it  can  only  be  shown  to  have  the  type  ({boQ::^,  y>  j;:a}). 

This  rule  also  gives  rise  to  equations  such  cis  {boa=lnt.  b'o(»'=Bool}.b'  =  Bool.  These  e(|uatioiis  allow 
any  valid  constructor  V'.b  to  be  reduced  to  a  constructor  which  contains  only  values  of  the  form  j-.y,  ■  .y„ 

(n  >  0).  Because  of  this,  it  is  not  necessary  in  our  system  to  consider  the  equality  of  arbitrary  values  (and 
hence  terms)  at  type-checking  time. 

When  the  equality  rules  compare  the  parts  of  a  constructor  that  are  in  the  scope  of  a  variable  binding, 
they  do  so  with  the  declaration  associated  with  that  variable  in  the  context.  For  example,  the  equality  rule 
for  O-types  below  compares  .-I'l  and  .do  with  x:.l|  in  the  context: 


r  h  .4,  =  .42  :;  « 
r.  j:.4i  I-  .4;  =  ,4;  ::  Q 
Fh  nz:.4,.A;  =  nj:,42..42  ::Si 


(E-DFIX) 


This  allows  use  of  the  ABBREV  rule  to  obtain  equations  such  as  the  following: 

nj::{bt>  0':;n=lnt}.  r.b  =  nr:{b>(>:  Q— Ini)  bit 

(b  c- a’:;n=Fnt,  y  >  rro)  =  {bt>  <»;:S}=lnl.  y  >  r:lnt} 

A  similar  effect  occurs  while  typing  terms.  For  example,  in  the  following,  wc  know  t  hat  j-.b  =  Int  while 
type  checking  M: 

(y  &  r;|bc>  (v:  ^=lnt},  y'[>r':.U} 

Because  of  the  ability  to  substitute  away  transparently  bound  names  using  the  equality  riih's,  no  dependency 
on  a  transparently  bound  name  is  ever  truly  essential.  This  allows  many  more  field  seli-ctions  and  function 
applications  to  type  check  than  would  oth-'rwise  be  the  c.ase. 

When  translucent  sums  are  given  fully  opaque  types,  they  act  like  weak  sums  which  can  be  used  to  create' 
abstract  data  types  (ADTs)  [35].  Because  we  have  dependent  functions  and  a  form  of  dependent  pairs  (a 
pair  of  terms  where  the  type  of  the  second  term  depends  on  the  first  component  of  the  |>air).  our  I'liminat  ion 
form  for  weak  sums  is  more  powerful  than  usual  [35,  7,  10].  Consider  the  fallowing  example  in  SML-like 
notation,  where  treaJc  is  used  to  construct  a  weak  sum; 


Fur  examplfi,  ( (b  a'=Int.  y  >  2:=.4}  :  { b  t>  y  t>  }  ).y  is  not  wt41-lyp<'<l. 
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let  stnictore  S  =  struct 
structure  Stack  =  aeeik 
type  T  =  (int  rel)  list 
val  makeStack: ()->T  =  ... 
val  push: (int. T)->()  =  ... 

end 

val  myStack  =  Stack. raakeStackO 
end  in 

S. Stack. pushd  .S.myStack) 
end 

This  e.xample  is  well-typed  in  our  system  because  we  can  determine  that  S.myStack  bas  type  S.  Stack.  T 
which  is  the  argument  type  of  the  function  S. Stack. push.  Note  that  there  is  no  way  to  type  this  example 
using  the  open  elimination  form  for  weak  sums  because  there  is  no  .scope  containing  both  the  initialization 
of  myStack  and  its  use  that  is  also  inside  the  scope  of  Stack. 


3.3  The  VALUE  rules 

Suppose  the  typing  context  contains  the  following  declaration: 

r:(bt>  y  >  x:f»} 


What  types  can  we  give  to  the  expression  r  under  this  context'.’  Because  we  have  a  name.  /•.  for  the 
translucent  sum  expression  we  are  trying  to  type,  we  have  a  name  for  the  contents  of  its  b  held,  namely  f  b. 
This  suggests  that  we  can  give  r  the  type  {bo  a::;j=r.b.  y  o  r:r.b}  which  is  a  subtype  of  the  context  t>|)e  for 
r. 

This  tech'  ique  of  giving  a  more  expressive  type  to  translucent  sums  when  we  have  a  name  fc  r  their 
constructor  components  can  be  generalized  to  work  on  arbitrary  translucent  sum  values.  The  name  in  this 
c;ise  is  simply  V'.b  where  V  is  the  value  in  question,  .\ttempting  to  extend  the  lechniqiu'  to  giuieral  tern  s 
requires  dropping  the  restriction  that  only  values  may  appear  in  constructors  aiul  results  in  unsoundness  in 
the  presence  of  effects. '■*  The  following  two  typing  rules  implement  this  technic|ue: 


r  t-  t'  :  (b  o  a::I\ .  D\ . Dn  } 

Ph  V':  {ho<iy.K=V.b.  D, . D„] 


(V,'iLIT:-0| 


r  I-  V  .y  :  .1' 

rt-  f  :  {yfi.A.D, . £>„} 

r  H  V'  ;  {y  >  D\ . D„} 


(V.\H  K-\') 


(The  VALUE- V  rule  is  used  in  ca.ses  of  nested  translucent  sums  to  apply  the  ti'chnique  recursively.) 

By  alternately  applying  the  VALUE  rules  to  conv<Tt  an  opaque  binding  into  a  transpari'iit  one  and  the 
subsumption  rule  to  propagate  that  definition  (and  hence  removing  any  depfuidencies  on  that  binding),  we 
can  give  any  translucent  sum  value  a  fully  tratisparent  type  (there  are  no  constructor  components  for  which 
information  is  lacking)  with  no  dependencies  lietween  the  fields  (or  subhelds).  Becau.se  of  this,  fit'ld  .seh'ction 
on  values  as  well  as  applications  of  functions  to  values  do  not  run  imo  problems  due  to  the  inability  to 
remove  dependencies.  Without  this  kind  of  usage  of  the  VALUE  rules,  I'xpressions  such  as  r.y  would  not 
type  check. 

The  more  expressive  type  given  by  the  VALUE  rules  to  transluc<-nt  sum  values  is  also  critical  to  th<' 
propagation  of  typing  information.  For  example,  if  .s  is  bound  to  the  restilt  of  the  expression  r.  it  will  be 
given  the  more  expressive  type,  allowing  the  fact  that  .s.b  =  r.b  to  be  inferri'd. 


'^This  would  allow  field  .selection  to  always  succeed  because  it  would  permit  all  ilepeiideiicifs  In  In’  |■<•ml>v<■d. 
extunple:  (M.y.2)  (Af  .y.l )  where  \i  is  the  example  ie  sei:tion  I.K. 


t  aisoiiiidness 
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4  Selected  Examples 

4.1  Simple  structures 

Typical  SML  structures  can  be  translated  straightforwardly  into  our  system,  witn  liie  only  complication 
being  the  treatment  of  polymorphism  (as  discussed  in  [16].)  Consider  the  following  structure  S  coiisidereil 
in  the  introduction; 


structurs  S  =  struct 
type  t  *  int 
t3rpe  u  =  t  ->  t 
val  1  =  fn  x:t  =>  x 
end 

This  translates  as: 

S  =  {t  >  t=lnt,  u&  u=t — t,  f>  f=Ax:t.  x} 

The  translations  of  the  signatures  considered  earlier  (only  SIG  here  is  actually  a  valid  .S.VIL  signature)  are: 


FULL^IG^ 

PARTIAL^IG^.l 

PARTIAL^IG-S-2 

tTG 


(t  t>  t;:ft=Int,  ue>u:;f2=lnt — Int. 
f  >  f  :Int— Int) 

(t  o  t;:Q,  ut>  u;;n=t— t,  f  >  f  :u} 
(t  t>  t;:ft=Int,  u  c>  u::0,  t  >  f  ;u) 

(t  o  ut>  u;;f2,  f  >  f  tu) 


The  subtyping  rules  for  our  system  establish  that  FULL-SIG_S  <  PARTIAL-SIG.3-1  <  SIG  and  FULL-SIG-S  < 
PARTIAL-SIG_SJ2  <  SIG.  The  signatures  PARTIAL.SIG-S-1  and  PARTIAL-SIG-S-2  are  incomparable. 

The  signature  given  to  S  determines  which  equations  on  S.t  and  S.u  can  lie  deduced.  By  tlefault  our 
system,  like  SML,  will  give  S  its  full  signature,  FULL_SIG-S.  This  means  that  we  will  be  able  to  deduce  that 
S.t  =  Int  and  S.u  =  Int— Int.  If  we  insert  a  coercion  to  one  of  the  other  signatures  before  the  binding  to  S. 
fewer  equations  will  be  deducable  in  our  system.  In  SML,  by  contrast,  user-specified  coercions  nev('r  result 
in  the  loss  of  typing  information.  They  can.  however,  result  in  the  loss  of  fields.  Thus,  in  order  to  translate 
a  coercion  from  SML  into  our  system,  we  need  to  enrich  the  target  signature  with  all  the  available  typing 
information. 


4.2  Abstraction 

SML/NJ  [2]  supports  an  extension  to  SML,  called  abslractton.  which  is  an  alternative  to  the  normal  strticture 
binding  mechanism.  If  the  keyword  abstraction  is  used  instead  of  the  keyword  structure  when  iunding 
a  structure,  all  information  about  the  constructor  compoiient.s  of  that  structure  is  forgotten.  Had  S  in  the 
previous  example  been  bound  with  an  abstraction  binding  instead  of  the  structure  one  we  u.sed,  it  would 
have  been  as  if  we  had  given  S  in  our  system  the  signature  SIG.  That  is  to  .say.  S.t  and  S.u  would  liave  been 
bound  opaquely.  .Note  that  it  is  not  po.ssible  in  .SML/N.I  to  give  S  a  partial  signature  using  this  mechanism. 
Only  the  fully  transparent  (via  structure)  and  fully  opaque  (via  abstraction)  alternatives  are  available. 

Abstraction  bindings  can  be  translated  into  our  system  by  inserting  a  forceil  coercion  just  befon'  the 
binding  to  the  appropriate  opaque  type.  For  example,  consider  the  following  implementation  of  an  abstract 
data  type  (ADT): 

abstraction  Stack  =  struct 
type  T  =  (int  rel)  list 
val  push: (T, int) ->()  =  ... 
val  pop:T->int  =  ... 
val  isEnpty : T->bool  =  ... 
end 
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This  translates  to: 


Stack  =  ({  To  T=list(ref  Int), 

pusho  puah=(. .  .):(T,  Int)— (), 
popo  pop=(. .  .):T— Int, 
isEaptyo  isE«pty=(. .  ,):T— Bool } 

)  :  {  ToT::ft, 

puaho  puah:(T,  Int)— ( ), 
popo  pop:T — Int, 
isEmptyo  isEapty:? — Bool  } 

Note  that  because  the  type  information  about  the  identity  of  the  T  field  is  lost  in  the  coercion,  the  rest  of 
the  program  will  be  unable  to  break  the  abstraction.  SML  provides  an  abstraction  mechanisr'’!,  abstype.  at 
the  core  language  level.  Because  translucent  sums  are  first-class  in  our  system,  we  can  achieve  the  effect  of 
SML’s  abstype  using  the  abstraction  binding  mechanism. 

4.3  Sub-structures 

Sub-structures  are  also  easily  translated.  For  example,  suppose  we  wanted  to  use  the  Stack  structure  in  a 
bigger  structure  as  follows: 

structure  Big  =  struct 

structure  ourStaclr  =  Stack 
type  T  =  ourStack.T 

end 

This  translates  into: 

Big  =  {ourStackc>  ourStack=Stack,  T  >  T=ourStack.T,  ...} 

Big  will  be  given  the  following  full  signature: 

{ourStacko  ourStack:{  Te>  T:.n=Stack.T. 

push  >  push:(T,  Int) — ( ), 
pops-  pop:T — Int. 
isE>pty>  isEnptyiT — Bool  }, 

T t> T:;n=ourStack.T,  ...} 

Note  that  we  have  that  Big. ourStack.T  =  Big.T  =  Stack.!. 

4.4  Functors 

Functors  translate  into  dependent  functions  in  the  expected  way.  Consider  the  following  example  from  the 
introduction: 

functor  F(structure  X:SIG):SIG  =  struct 
type  t  =  X.t  *  X.t 
type  u  =  X.u 
val  f  =  X.f 
end 

This  translates  into: 

F  =  AX:SIG.  (  {t  s  t=X.t  ♦  X.t,  ut>u=X.u.  f  c>  f=X.f }  : 

{t  c>t::n=X.t  »  X.t,  u>u::$]=X.u,  f  >  f;u}) 

(The  coercion  on  the  result  type  of  the  functor  is  an  abbreviation  for  a  coercion  on  the  functor  body.)  .Note 
the  enriched  signature  we  have  to  give  instead  of  SIG  in  order  to  make  the  coercion  liav('  the  same  effect  .as 
it  does  in  SML.  Translating  the  functor  signatures  we  considered  for  F  gives: 

FULL^IG-F  =  nX:SIG.  {t  t>  t::n=X.t  »  X.t,  u  c>  u::Sl=X.u, 
f  i>  f  :u} 

PARTIAL_SIG.F  =  nX:SIG.  {te-t;:n,  ufu::n=X.u,  f  cf:u} 

SIG  =  nX-.SIG.  SIG 
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FULL^IGJf 

=  nX:SIG.  {t  c>  t:;f2=X.t  «  X.t.  ut>u:;n=X.u,  1 1>  f:u} 

<  nX;FULL-SIG-S.  (t  >  t::fi=X.t  *  X.t,  ue.u;:Q=X.u,  f  (>f:u) 

=  nX:{t  >  t:;n=Int,  uo  u::f2=lnt— Int,  f  >  f  :lnt — Int}.  {t  >  t::n=X.t  •  X.t,  u>  u::n=X.u.  f  >  f  :u} 

=  nX:{t  t>  t::i]=Int,  uc>  u;;f2=[nt— Int,  f  o  f  :Int — Int}.  {t  t>  t::n=Iiit  ♦  Int,  u>  u;;f2=Int— Int.  f  t>  f  .uj 
=  FULL-SIGJS — {t  >  t::f2=Int  *  Int,  uc>  u:;f2=Int — Int,  f  >  f  :u) 

Figure  4:  Steps  in  coercing  F’s  type  to  an  arrow  type 

Here,  FULL_SIG_F  <  PARTIAL_SIG  J  <  SIG_F. 

Suppose  T  was  bound  to  the  result  of  applying  F  to  S.  Before  the  APP  rule  can  be  applied  to  determine 
T’s  type,  the  subsumption  rule  must  be  used  to  coerce  F’s  type  (FULL-SIGJ)  to  an  arrow  type.  One  way 
this  could  be  done  is  shown  in  Figure  4.  First,  F’s  argument  type  is  coerced  to  a  subtype  (remember  that 
FULL-SIG-S  <  SIG)  using  the  fact  that  subtyping  of  function  types  is  contra- variant.  .\e.nt.  the  eijuality  rules 
are  used  to  remove  the  dependencies  on  X  by  the  result  type,  resulting  in  an  arrow  type.  The  result  is  that 
T  gets  eissigned  the  following  type: 

{t  &  t:;n=Int  *  Int,  ut>u;:ft=Int — Int,  f  >  f  ;u} 

If  F  had  instead  had  the  type  PARTIAL_SIG-F,  T  would  have  been  assigned  the  type: 

{t  &  t::n,  u  >  u::ft=Int — Int,  f  t>  f  :u} 

4.5  Sharing  specifications 

The  basic  idea  in  translating  sharing  specifications  is  that  for  each  set  of  names  that  are  a.s,serted  to  be  equal, 
pick  one  with  maximal  scope  as  representative  of  the  equivalence  class  and  set  the  others  equal  to  it  using 
transparent  definitions.  For  example,  the  following  SML  signature: 

signature  H  =  sig 
type  t 
type  u 
type  V 

sharing  type  t  =  u 
and  tyrpe  v  =  u 

end 

translates  into: 

H  =  { t  o  t::f2,  u  o  u::li=t.  v  c>  v:;n=t } 

A  more  interesting  case  is  provided  by  the  argument  signature  of  the  Parser  futictor  in  MacQiu'en's  exatiqtle 
from  the  introduction; 

sig  structure  L: LEXER 
structure  T:SyMTAB 
3h2a:ing  type  L .  S .  syinbol=T .  S .  symbol 
end 

This  translates  into: 

{Lt>  [.:LEXER,  Tc>  T;{S  >  S:{3y»bolt>  3y»bol::Ji=L.S. symbol, 

...}} 

The  omitted  parts  are  the  usual  translation  of  the  rest  of  SYMTAB  and  SYMBOL.  This  1  ranslation  nu'thod  also 
works  on  sharing  between  constructors  in  the  argument  and  result  of  a  functor. 
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4.6  First-class  modules 


So  long  as  we  restrict  ourselves  to  simple  module  operations  like  binding,  functor  application  of  a  named 
functor  to  a  named  or  fully  transparent  module,  and  selection  from  a  named  module,  we  never  lose  any 
typing  information.  In  fact,  the  only  module  operation  available  in  SML  that  causes  a  loss  of  information 
when  used  in  our  system  is  coercing  a  module  to  a  user-specified  type.  This  is  not  surprising,  however,  since 
the  purpose  of  coercions  is  controlled  information  loss. 

Due  to  the  fact  that  modules  are  first-class  in  our  system,  it  is  possible  to  write  module  e,\pressions  which 
force  the  loss  of  typing  information  in  order  to  preserve  soundness.  For  example,  consider  the  following:^^ 

if  flip!)  then  (b>ft=Inl,  yc>r=(3.  succ)} 
else  (b  0  a=Bool,  y  >  x=(  true,  not ) } 

While  both  parts  of  the  if  can  be  given  fully  transparent  types,  these  types  are  not  equal.  In  order  to  make 
the  if  type  check,  we  must  give  them  equal  types.  The  only  way  to  do  this  is  to  use  the  subsumption  rule 
to  coerce  both  of  their  types  to  {boo:;n.  yi>x:(nf,  a—  «)}. 

The  system  described  in  [34]  displays  similar  behavior,  namely  a  forced  loss  of  typing  information  when 
using  modules  in  conditionals  and  other  primitives.  In  that  system,  types  are  divided  into  two  universes. 
Ui,  the  universe  of  “normal”  types  like  Int  and  Bool — Int,  and  tw,  the  universe  of  module  types.  The  loss  in 
this  system  is  caused  by  the  need  to  apply  an  implicit  coercion  from  a  strong  sum  (which  belongs  to  I'n)  to 
a  weak  sum  (which  belongs  to  Ui)  because  primitives  operate  only  on  terms  with  types  in  L\ .  This  coercion 
causes  a  total  loss  of  typing  information.  Our  system  is  more  flexible  than  this  becau.se  it  only  loses  just 
enough  information  to  ensure  soundness. 

The  possible  uses  for  first-class  modules  have  not  been  well  explored.  One  known  use  discussed  in  [fll)]  is  to 
select  at  runtime  between  two  or  more  ADTs  which  implement  the  same  abstraction  using  different  algorithms 
based  on  expected  usage  conditions  For  example,  we  could  use  one  particular  hash  table  implementation 
for  small  tables  and  another  for  large  ones. 

5  Related  Work 

An  early  influential  attempt  to  give  a  comprehensive  type-theoretic  analysis  of  modularity  aiul  abstraction 
was  undertaken  by  Burstall  and  Lampson  with  the  experimental  language  Pebble  [a].  Their  work  stresses 
the  role  of  dependent  types  and  the  mechanisms  required  to  support  abstraction,  but  does  not  aildress  the 
problem  of  controlling  the  “degree”  of  abstraction.  In  particular.  Pebble  supports  typi*  and  value  Iniulings 
as  primitive  notions,  but  with  an  “opaque"  typing  discipline,  in  contrast  to  our  calculus. 

Cardelli’s  language  Quest  [7]  has  e.xerted  a  strong  influence  on  the  present  work.  Our  approach  shares 
with  Quest  the  emphasis  on  type-theoretic  methods,  an<l  is  similarly  based  on  Oirard  s  enriched  with 
a  notion  of  subsumption  (though  we  depart  from  (.’ardelli's  approach  by  omitting  bounded  quantification). 
Quest  does  not  provide  an  adequate  treatment  of  modularity;  our  work  can  be  seen  as  providing  the  type- 
theoretic  batsis  for  an  extension  of  Quest  with  an  expressive  inotlule  .system. 

Mitchell,  et  al.  [34]  consider  an  extension  of  the  SML  module  system  with  first-class  mo<lules  as  a  means  of 
supporting  certain  object-oriented  programming  idioms.  Their  paper  is  jtrimarily  concerni'd  with  illusi  rating 
an  interesting  language  design  rather  than  with  the  type-theoretic  underpinnings  of  such  a  language,  though 
a  brief  sketch  is  provided.  A  comparison  with  their  work  is  given  in  Section  4.(5. 

The  type-theoretic  analysis  of  the  SML  modules  system  was  initiated  by  MacQueen  [27],  and  further 
developed  by  Harper  and  Mitchell  [33,  20,  19],  This  work  is  summarized  and  compared  with  the  present 
work  in  the  introduction. 

Our  language  bears  some  relationship  to  Russell  [4]  and  Poly  [30],  but  a  detailed  comparison  seimis 
difficult  in  the  absence  of  a  type-theoretic  analy.sis  of  tln'se  languages  (see  [21]  for  an  <'arly  attempl ). 

In  an  effort  to  address  the  problem  of  separate  compil.atiou,  I.eroy  has  indep<Mid(’ntly  (h'vi'loped  ;i  variant 
of  the  SML  modules  system  based  on  the  notion  of  a  “manifi'st  type"  which  is  similar  in  spirit  to  mir 


’^For  the  unsoundness  examples,  flip  is  a  function  which  alleniales  n’liiniiiig  true  aiiil  false.  It  is  c.a.sily  iiiipiciiiciili-d 
using  a  global  variable. 
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translucent  sum  types.  See  Leroy’s  paper  [23]  for  a  description  of  his  system  and  some  comments  on  its 
relationship  to  ours. 


6  Conclusions 

The  main  contribution  of  this  work  is  the  design  of  a  calculus  of  modularity  with  the  following  features: 

•  Fine  control  over  the  "degree”  of  abstraction  through  the  notion  of  a  translucent  sum  type. 

•  A  treatment  of  modules  as  first-class  entities  without  sacrificing  the  control  over  type  abstraction 
afforded  by  a  second-class  module  system. 

•  Support  for  separate  compilation  in  a  form  that  ensures  the  complete  equivalence  between  .separate 
and  integrated  compilation  of  a  large  system. 

The  following  are  some  important  directions  for  future  research: 

•  Establish  the  soundness  of  the  type  system  by  proving  preservation  of  typing  under  a  call-l)y-value 
operational  semantics. 

•  Investigate  the  efficiency  of  type  checking  and  develop  practical  algoritliins  that  may  be  used  in  an 
implementation.  We  show  in  Appendix  B  that  the  subtyping  problem  for  our  system,  and  hence  the 
type  checking  problem,  is  undecidable.  There  is  reason  to  believe,  however,  that  this  will  not  be  a 
problem  in  practice. 

•  Design  an  elaborator  to  translate  an  SML-like  syntax  into  the  calculus,  including  a  systematic  treat¬ 
ment  of  the  reduction  of  symmetric  sharing  specifications  to  asymmetric  definitions  in  signatures. 

•  Develop  an  treatment  of  structure  sharing  that  accounts  for  structure  generativity  anil  interacts  well 
with  computational  effects. 
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A  The  Typing  Rules 

DeRnition  A.l  (Judgements) 

h  r  valid  valid  context 

r  H  ,4  ::  A'  valid  constructor 

r  h  .4  =  .4'  ::  A'  ee/ual  constructors 
r  H  D  =  ZJ'  equal  declarations 

r  H  ,4  <  ,4’  subtype  lelation 

r  D  <  D’  subfield  lelation 

r  U  .M  :  ,4  ivell-lyj>ed  term 

V  B  .  D  well-typed  binding 

Deflnition  A.2  (The  field  name  stripping  function) 

bc>o;:A'  =  o-A’ 

boo::A'=A  =  Of::l\=.\ 

y!>x.:A  =  r:.\ 

Definition  A. 3  (Context  Formation  Rules) 

f-  •  valid  (IN'Un.AL-  I  ) 

I-  U  valid  n  ^  dom(  U ) 

1-  r.f»::A'  valid 
h  r,  0'::A  valid  T  h-  ,4  A 
h  r,0':;A'=/l  valid 
r  h  .4  ;;  fl  X  ^  dom(  F) 

I-  r,x;/l  valid 


(DEU-O) 

(DEU-'D 

(DEE-V) 
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Dennition  A.4  (Constructor  Formation  Rules) 

H  r  valid  a::K  €  T 
r  h  a  ::  A' 

I-  r  valid  a::K=A  €  T 
n-  a  ::  A' 
r  t-  V'  :  {bs  a:;A'} 
r  h  V.h  ::  A' 

ri-  ::n 

h  r  valid 

r,Dh  {Di _ D„]  ;:n 

r  I-  . Dn]  ::  n 

r,a::A  t-  .4  ::  A' 
r  t-  \q::K.A  ::  A’^A' 
rHAi::A2^A  ri”  A2  h  2 

r  H  /ti  A2 "  A 

Definition  A.5  (Constructor  Equality  Rules) 

r  I-  A  ::  A' 
r  I-  A  =  A  ::  A' 

r  h  /•'  =  A  ::  A' 

FT  1  =  A'  ::  A' 

r  h  A  =  A'  A  f  h  A'  =  A"  ::  A 
.  r  I-  A  =  A"  ::  A 

r  h  A,  =  A2 ::  n 
r..r:Ai  ha;  =  A^  ::  Q 
r  h  ni:A,.  a;  =  ni:A2.  a^  n 
r  I-  £)  =  £>' 

r.Ph  (Pi . Dn}  =  {Pi . p;,)  ■:« 

rh  . Dr,}  =  {D'.D\ . D'„}  ::J1 

r,  a;:A  h  A  =  A*  A  ’ 
r  H  Xa\:l\.  A  =  Aa:;A.  A'  ::  I\=>  l\' 

r  I-  A2  =  A2  ::  A 
r  h  A,  =  a;  ::  A=>  A" 
r  I-  A,  A2  =  A',  A'2  ::  A' 
r,a;:AhA::A"  T  h  A' ::  A 
r  h  (Aa:;  A.  A)  A'  =  [A'/ajA  ::  A' 
r,  a::  A'  h  A  a  ::  A  '  P  i-  A  A’^  A’’ 

P  h  Ao::A',  Aa  =  A  ::  K=>K' 

P  t-  V  :  {bt>a::A=A} 

P  h  K.b  =  A  A 
t-  P  valid  a;:A’=A  €  P 


(C-VAR-O) 

(C-VAR-T) 

(C-EXT-O) 

(C-DFUNM 

(C-UNIT) 

(C-TSUM) 

(C'-LAM) 

(C-APP) 

(E-REFL) 

(E-SYM) 

(E-TRAN) 


(E-DFUN) 


(E-TSUM) 


(E-LAM) 


(E- APP) 


(E-BETA) 


(E-ETA) 

(ABBREV) 


Ph  a  =  A  ::  A 


(ABBREV) 


Deflnition  A.6  (Declaration  Equality  Rules) 

h  r,a;:A‘  valtd 
r  1-  ht>a::h'  =  h>a::K 

r  H  A  =  A’  h  r,a;:A  =/l  valid 
r  h  b  ^  a.:K=A  =  b  c>  a::h'=A' 

r  t-  /I  =:  A'  ::  n  H-  r,  i-.A  valid 
r  h  y  t>  x:A  =  y  t>  x:A' 

Definition  A. 7  (Subtyping  Rules) 

r  t-  .4  =  .4'  ::  n 
r  h  ,4  <  A' 

r  h  ,4  <  .4'  r  (-  ,4'  <  .4" 
r  h  ,4  <  .4“ 

r  (-  ,42  <  .4i 

r,  j:,42  t-  r  I-  nj:Ai .  ,4'i  ::  Q 

r  I-  rirrA) .  .4'i  <  nj:A2-  -42 

r^D<D‘  Vh  {D'.D\,....D'„]  -.Q 

r.D^-{Di . Dn)  <  \D\ . D'„,) 

rh  . D„)  <  {D\D\....,D'm} 

rt-  {Di . Dn,D)  ::Q 

ri-  {O, . Dn.D}  <  [Dx . Dr,} 

Definition  A. 8  (Subfielding  Rules) 

r  (-  D  =  £>' 

r  I-  £»  <  D' 

«  r  H  ,4  <  /I*  I"  r,x:-4  valid 

r  I-  y  >  x:,4  <y>  x;-4' 

t“  r.a::A’=/l  valid 
r  H  bt>  a-.:K=A  <  h>a::l\ 

Definition  A.9  (Term  Formation  Rules) 

I-  r  valid  x;,4  €  T 
r  I-  X  ;  ,4 


(EQ-O) 

(EQ-T) 

(EQ-V) 

(S-EQ) 

(S-TRAN) 

(S-DFI’N) 

(S-TSl'M) 

(S-THIN) 

(S-SAME) 
(S-VAH'E) 
(S- FOR  GET) 

(VAR-V) 


r.  x:-4  H  M  :  .4' 

Ff-  Ax:-4.iV/  :  nx:-4.  ,1' 
Fh-W,  :A,  — A2  FFAf2:-4i 
F  F  iW,  M2  :  .42  ~ 

h  F  valid 

Vi  €  [l  .n].  F.  P7,  ■  ■  ■ ,  OTT  F  B,  :  P. 
FF  {fli . Br.)  :  {Dx . Dr,] 

F  F  iVf  ;  {y  ^  x:/l} 

F  F  A/.y  ;  ,4 

FF  {b>a::A',  £>i . D„  } 

F  F  K  :  {b  0  a::  A’=F'.b,  Z7i , . . . ,  Dr, } 


(LAM) 

(APP) 

(TSPM) 

(EXT-V) 

(VALIIE-O) 
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^(p) 

=  <  3a,ai, . . .  ,o„. -'(3o'=Qf,  e»'i=F(pi ).  •  •  •  .Q'I.=F(p„). --Flpi )) 

[.  3a,  ori, . . . ,  -■a 

if  p  =  a, 

if  p  =  [oi . an]<Pl  ■  .Pn> 

if  p  =  HALT 

F(R) 

<7 

=  3a=<r,  ai=F(pi ), .  . .  ,a„=F’{p„).  -’F(pi)  <  er 

=  3a,  ai , . . . ,  an.  ■'(3a'=a,  a'i=ai , ...,  a]i  =  an-  ~>a) 

where  R  =  <p\  .  .  .  p„>  and 

Here,  a,  o',  and  oi  through  oj,  are  fresh  variables. 


Figure  5:  Modifications  to  Pierce’s  encoding  of  row  machines 


r  I-  V  .y  ; 

r  h  V  :  {yo  . £>n} 

ri-  V'  :  {yt>i::/l'.D, . D„] 

rh  M  -.A 
rt-  M  .A  :  A 

r  H  M  .  d'  r  h  ^'  <  .d 
FTmTa 

Definition  A. 10  (Binding  Formation  Rules) 

H  r,o::A’=/l  valid 
r  I-  b  t>  a=/l  :  b  t>  q::K=A 

r  I-  iV/  :  ,1  H  r,x:A  valid 
r  h  y  ^  x=M  :  y  c>  x:/! 

Lemma  A. 11  (Properties  of  the  typing  system) 

1.  i/  r  H  A  ::  A'  then  i-  F  valid 

2.  i/  r  1“  A]  =  A2  ::  A  then  F  l~  Ai  A  and  F  h  A2  A 

3.  1/  F  I"  Z?i  =  D2  then  I-  F,  D\  valid  and  h  F,  D2  valid 

4-  if  r  h  Ai  <  A2  then  F  I-  Ai  ::  Q  and  F  I-  A2  ::  Q 

•5.  1/  F  h  £>1  <  D2  then  F,  Di  valid  and  t-  F,  D2  valid 

6.  i/F  h  M  ;  A  then  F  h  A  :: 

7.  ifThB.D  then  h  T,  D  valid 

8.  i/h  r,  D  valid  then  I-  F  valid 

B  Undecidability  of  Subtyping 

The  subtyping  relation  for  our  system  can  be  shown  to  be  iindecidal)le  by  a  sligiit  modification  to  Ben  jamin 
Pierce’s  proof  of  the  undecidability  of  F<  subtyping  [38.  37].  The  basic  source  of  nndecirlability  is  the 
subtyping  rule  (FORGET)  that  allows  the  forgetting  of  information  about  the  type  components  of  translucent 
sums. 

Even  a  Vcistly  simpler  system  with  transparent  and  opaque  sums  and  a  forgetting  rule  is  nndeciilabh'.  In 
order  to  demonstrate  this  as  well  as  simplify  the  discussion,  we  consider  now  a  very  simple  fragment  of  our 
full  system. 


(V.\LUE-V) 

(COERCE) 

(SUBS) 

(BIND-T) 

(BIND-V) 
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B.l  The  fragment 

The  fragment  is  obtained  from  our  system  by  restricting  the  set  of  constructors  to  include  only 

types  and  restricting  the  methods  of  building  types  to  only  allow  for  arrow  types,  binary  opaque  sums  (often 
called  weak  sums),  and  binary  transparent  sums.  We  use  a  slightly  different  notation  to  empha-size  that 
these  are  simpler  constructs.  The  syntax  for  A  ^  is  as  follows: 

Types  T  ::=  a  |  (  3rt.d  j  3a=di  ..l2 

As  before,  the  meta-variable  «  ranges  over  type  variables  and  we  identify  types  that  differ  only  by  o- 
conversion.  The  translation  back  to  our  earlier  notation  is  as  follows: 

a  =  a 

Ai — '.d2  =  llxtAi.Ai 

3a. A  =  {bc>  y  >  x:d} 

3a=di.A2  =  {b t>  a::ft=:Ai ,  y  >  j:.-l2 } 

The  effect  of  the  subtyping  rules  of  our  system  on  this  fragment  is  captured  by  the  following  simple  set 
of  rules: 

Definition  B.l  (Subtyping  rules  for  X~~ 

a  <  n 

A\  <  dl  A2  <  ^2 
.li  — A2  <  A'l  —  Ai 
A  <  A' 

3a. A  <  3a. /l' 

[/l/a]At  <  [.4/a]A2 
3a=;A..li  <  3o=:A.A2 
[■■l/a]Ai  <  [A/a].A2 
3o=/l..li  <  3a. A2 

Note  that  this  set  of  rules  is  completely  syntax  directed  and  does  not  re(|uire  the  use  of  a  context  becau.se 
of  the  explicit  use  of  substitution.  The  proof  that  this  set  of  rules  corresponds  to  the  subtyping  rules  of  tlie 
original  system  on  this  fragment  is  omitted.  For  the  purposes  of  the  undecidabilily  proof,  we  will  only  need 
the  following  lemma: 

Lemma  B.2  The  subtyping  relation  for  A  t.s  reflexive. 

The  proof  proceeds  by  structural  induction  on  the  size  of  the  type  using  the  following  measure:”’ 

|a|  =  0 

|A| — A2I  =  1  +  l/li  I  +  I.I2I 

|3a.A|  =  l+IAl 

|3a  =  /li.A2|  =  1 l[.li/a]A2| 


(VAR) 

(ARROW) 

(SfM-O) 

(SFM-T) 

(FOlUIEl  •) 


B.2  Undecidability  of  subtyping 

Theorem  B.3  If  the  FORGET'  rule  ts  removed,  then  A  ■3-3=  subtyping  is  decidable. 


‘®Note  that  |[Ai/a]A2l  =  l[|Ai  |/a],42l  if  we  define  |i|  =  1. 
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Proof:  Each  use  of  the  other  rules  strictly  decreases  the  following  non-negative  measure,  so  the  simple 
syntax-directed  procedure  always  terminates  in  this  case:  |.4i  <  -dil  =  |.4i|  4-  |.42|  O 


Note  that  use  of  the  FORGET’  rule  does  not  decrease  this  measure  and  in  fact  can  increase  it  because 
the  type  on  the  right  side  can  grow  without  limit  in  the  recursive  call.  This  fact  can  be  used  to  construct 
examples  that  cause  the  simple  syntax-directed  procedure  for  checking  subtyping  to  loop.  For 

example,  consider  the  following  definitions; 

-id  =  .4 — a' 

P{A)  —  3a’=i4.-'d  where  o  fresh 

Gc»(^4)  =  3o.->d. 

The  definition  of  -iT  is  chosen  so  that  -i.di  <  -i.ds  iff  Ts  <  .4|.  Any  type  constructor  with  a  contravariant 
subtyping  rule  could  be  used  here.  An  example  which  causes  cyclic  behavior  is  then  as  follows: 

P(G«(P(«)))  <  Ga(P(o)t 

=  3a=Ga(  jP(« ) )  P((v))  <  3o.-'P(a) 

=»  [Ga(  P(a))/a](-'Ga(P(o)))  v  [Ga(  P(a ))/o](-«P((v) ) 

=  -'Ga(P(a))  <  ->P(Ga(P(o))) 

=>  P(Go(P(a)))  <  Ga(P(o)) 


Theorem  B.4  subtypmg  is  undecidable. 

Pierce’s  proof  can  be  found  in  chapter  b  of  his  thesis  [38]  or  in  [37].  Space  cunsid-^rations  prevent  outlining 
it  here.  The  modifications  necessary  to  change  his  encoding  of  row  machines  so  that  it  produci's  A  ^ 
subtyping  questions  instead  are  found  in  Figure  5.  The  key  differences  are  as  follows: 

•  Use  of  3a=,4..4i  <  3a. Tj  instead  ofVa.,42  <  Vft<.4..4i. 

•  Use  of  reflexivity  to  halt  computation  instead  of  the  FTOP  rule.  (Compare  the  two  delinitions  of 
/■(HALT)) 
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